s1x0rteen                @         16        @               sICKSTEEN
                         @    hex and such   @
s1XXT33n                 @       ProZaq      @                 sixt33n


Are you a "newbie"?  As long as you're interested not only in computers but also in
what makes computers work the way they do, you'll definitely need to learn and master
the meaning of a few basic terms and concepts.  Take, for example, the hexadecimal
number system: it doesn't matter if you want to learn the basics of programming, write
programs for Macs or PCs, or just cheat on some computer games; you have to learn and
master it in order to be able to "exploit" it.  And as you learn more you will notice
that all these concepts are interrelated and one can be manipulated to change the other.

In this file I shall try to explain the following topics: binary and hexadecimal numbers,
bytes/words/longs, ASCII characters, strings, HexEditors, the hardware components of a
computer, and debuggers.  If you find that you are not familiar with an expression, then
take a look in the "The Computer's Hardware Components" chapter.


--==< Binary, Decimal, and Hexadecimal Numbers >==--

Oh boy!  Where do I start?  Well, at the very, very, very beginning...
If I remember my IT classes well, much of the credit for the logic behind binary
calculations goes to an English fellow named George Boole.  Among other things he
developed Boolean algebra.  Remember all those horrible hours you had to spend in algebra
class learning formulas like: a(b+c) = a*b + a*c?  Boolean algebra has rules much like
that one, only for logical values.  He developed a type of logic that uses ones and zeros
(true and false) to represent the logical flow of an operation, which is the kind of
logic that every personal computer chip uses today.

You know how everyone is always saying  that computers are all about ones and zeros?
Well that's because everything in computers narrows down to being a one or a zero (an
electronic current or the lack of it).

But what on earth is the binary number system?  Well, let's try to define the decimal
number system first (the one we use in every day mathematics) since we're more familiar
with it.

The decimal number system is based on the number 10.  Hence the name "decimal", which
comes from the Latin word for "tenth" (doesn't "mal" mean "multiply" in German?).  You
have the digits zero through nine.  When you start counting from zero up, you eventually
hit nine.  And what happens when you get to ten?  You reset the rightmost column (set it
to zero) and carry a one into the next column.  At one hundred you reset the two
rightmost columns and carry a one into the next one.  And so on.  So as you notice you
carry at the powers of ten: 10^1 = 10 (^ means "raised to the power of"), 10^2 = 100,
10^3 = 1000, 10^4 = 10 000, 10^5 = 100 000, 10^6 = 1 000 000 etc.

Let's break the number "9876" into columns representing the numbers at which the carrying
occurs: the "thousands", "hundreds", "tens", and "ones" columns.

| Thousands  |  Hundreds  |    Tens    |    Ones    |
|   (10^3)   |   (10^2)   |   (10^1)   |   (10^0)   |
|      9     |      8     |      7     |      6     |

As you might have noticed, in order to get the number nine thousand eight hundred and
seventy six you multiply the value of each column by the appropriate power of ten (any
number raised to the power 0 is 1) and then add the results together
(9*10^3 + 8*10^2 + 7*10^1 + 6*10^0).
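
For those who like to see things in code, here is a minimal sketch in C of the same
idea run backwards: taking a number apart into its thousands, hundreds, tens and ones
columns.  The function name split_columns is made up for this example, and it assumes
the number is smaller than 10 000.

```c
/* Split n (assumed to be below 10000) into its four decimal columns.
   out[0] = thousands, out[1] = hundreds, out[2] = tens, out[3] = ones. */
void split_columns(unsigned int n, unsigned int out[4])
{
    unsigned int power = 1000;          /* value of the leftmost column */
    for (int i = 0; i < 4; i++) {
        out[i] = n / power;             /* how many times this column fits */
        n %= power;                     /* keep what is left for the columns to the right */
        power /= 10;                    /* move one column to the right */
    }
}
```

Calling it with 9876 should fill the array with 9, 8, 7 and 6, and multiplying those back
out (9*1000 + 8*100 + 7*10 + 6*1) gives the original number again.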

In the binary number system we only have two digits to work with instead of the ten we
had in decimal: one and zero.  This means that instead of carrying at the powers of ten
we carry at the powers of two, namely: 2^1 = 2, 2^2 = 4, 2^3 = 8, 2^4 = 16, 2^5 = 32,
2^6 = 64, 2^7 = 128 and 2^8 = 256.

When dealing with binary, all eight columns are often written out even when the leading
ones are zero.  It makes the calculations easier.  For example, one in binary has
the value 1 but can also be written as 00000001.

Here is a little chart showing the numbers zero to sixteen in binary:

Value of column: 
 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | = value of each column added up
  0     0    0    0   0   0   0   0   = 0
  0     0    0    0   0   0   0   1   = 1
  0     0    0    0   0   0   1   0   = 2 
  0     0    0    0   0   0   1   1   = 3
  0     0    0    0   0   1   0   0   = 4
  0     0    0    0   0   1   0   1   = 5
  0     0    0    0   0   1   1   0   = 6
  0     0    0    0   0   1   1   1   = 7
  0     0    0    0   1   0   0   0   = 8
  0     0    0    0   1   0   0   1   = 9
  0     0    0    0   1   0   1   0   = 10
  0     0    0    0   1   0   1   1   = 11
  0     0    0    0   1   1   0   0   = 12
  0     0    0    0   1   1   0   1   = 13
  0     0    0    0   1   1   1   0   = 14
  0     0    0    0   1   1   1   1   = 15
  0     0    0    1   0   0   0   0   = 16

Here's another approach to explaining how binary works.  Try adding up the values
of the columns where there is a one.  In ten for example (00001010) there is a one in the
two's and the eight's column.  Thus when these values are added together (two plus eight)
we get ten.  The same goes for fifteen (00001111): there's a one in each of the four
rightmost columns, so eight plus four plus two plus one equals fifteen.
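
The "add up the columns that hold a one" method can be sketched in C roughly like this.
The function name binary_to_value is hypothetical, and the sketch assumes the text
contains nothing but the characters '0' and '1'.

```c
#include <string.h>

/* Add up the column values wherever the text has a '1'.
   Assumes bits contains only '0' and '1' characters. */
unsigned int binary_to_value(const char *bits)
{
    unsigned int value = 0;
    unsigned int column = 1;            /* the rightmost column is worth 1 */
    size_t len = strlen(bits);

    for (size_t i = len; i > 0; i--) {  /* walk from right to left */
        if (bits[i - 1] == '1') {
            value += column;
        }
        column *= 2;                    /* each column is worth twice the previous one */
    }
    return value;
}
```

Feeding it "00001010" adds the two's and the eight's column and gives ten, just like in
the paragraph above.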

OK, now we've reached the hexadecimal numbers.  Well, for these suckers we carry at powers
of sixteen.  In other words we count from zero to fifteen before resetting the first
column and increasing the next.  The slight problem of only having ten digits in our
everyday number system is solved by using six letters of the alphabet to represent the
numbers ten through fifteen.  Thus the digits used in the hexadecimal number system are
written as follows:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.

Once fifteen is reached, the next number (as always) is represented by resetting the first
column and increasing the next.  Meaning that sixteen in hex is "10".
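
Here is a small sketch in C of how hex digits turn into a value: every digit is worth
0 to 15, and every step to the left multiplies by sixteen.  The names hex_digit_value
and hex_to_value are made up for this example, and the letter arithmetic assumes an
ASCII system.

```c
#include <stddef.h>

/* Value of one hex digit (0-15), or -1 if c is not a hex digit.
   Assumes the letters A-F and a-f are stored consecutively, as in ASCII. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert a string of hex digits to its value.
   Stops at the first character that is not a hex digit. */
unsigned long hex_to_value(const char *hex)
{
    unsigned long value = 0;
    for (size_t i = 0; hex[i] != '\0'; i++) {
        int digit = hex_digit_value(hex[i]);
        if (digit < 0) break;
        value = value * 16 + (unsigned long)digit;  /* shift one column left, add the digit */
    }
    return value;
}
```

With this, "F" comes out as fifteen and "10" as sixteen.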

If you have managed to get this far you've done a good job.  And if you still have
difficulties understanding what the different number systems are all about then I'll let
you in on a big secret.  Only a very few people convert between number systems in their
head. Most of us mortals rely on something called the "Scientific calculator".   This
makes life a lot simpler!  I always use a calculator simply because it's just so much
faster.  I believe that if you know the principles behind the different number systems
and you have access to a calculator that converts between these then you're set.

So now you know what different number systems are.  But when it comes to writing them down
some difficulties may arise.  It's obviously easy to distinguish numbers represented in
binary.  Just to be on the safe side, however, it's a convention to put a "%" sign in
front of binary numbers.  On the other hand "123" can be a number written in either hex
or decimal form.  If it's a decimal number it's simply one hundred twenty three.  But if
it's a hexadecimal number then it has the decimal value of 291, two hundred ninety one
(1*256 + 2*16 + 3).  Big difference there!  So how do you distinguish between hex and
decimal numbers?  Well the most common way is to represent hex numbers by putting a
dollar sign, "$", in front of the number.  In the programming language C you represent
hex numbers using the "0x" prefix.  In assembly language it is common practice to use
the "#" sign when representing decimal numbers.  I tend to be very lazy so when I want
to represent decimal numbers I just don't bother using any signs, but for hex numbers I
always use the "$" sign.  For
example: #12345 (decimal) is $3039 (hexadecimal);  and $ABCDEF (hexadecimal) is 11259375
(still decimal if no sign is used).  Through the course of this file I will use this
method of notation.  I might, however, refer to hexadecimal numbers without the $ sign
if I think that it's obvious what I mean.
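
If you have a C compiler at hand you don't even need the calculator.  The sketch below
uses the standard library functions strtoul and snprintf to read the same digits as
decimal or as hex, and to print a value in the "$" notation used in this file.  The
wrapper names parse_in_base and format_dollar_hex are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read text as a number in the given base (10 for decimal, 16 for hex). */
unsigned long parse_in_base(const char *text, int base)
{
    return strtoul(text, NULL, base);
}

/* Write value into out as "$" followed by uppercase hex digits. */
void format_dollar_hex(unsigned long value, char *out, size_t out_size)
{
    snprintf(out, out_size, "$%lX", value);
}
```

Reading "123" with base 10 gives 123, while reading it with base 16 gives 291.  In C
source code the same two numbers would be written 123 and 0x123.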


--==< Bytes, words, and longs >==--

Now that you know what hex is, there is a need to discuss the length of a number.  The
length of numbers plays a large part when it comes to writing programs.  By using numbers
with different lengths the programmer can manipulate data much more easily.  Another
benefit of numbers with different lengths is that a small number will occupy a small
place in memory instead of occupying an unnecessarily large one.  This is not much of
a problem now with the increase of both RAM and hard disk sizes, but back in the days
of C-64's and before, when programmers only had so much RAM to work with, it was very
important whether a number took up 1 or 4 bytes.

Anyway, in assembly language for the 68k Macintosh processors we talk about bytes, words
and longs.  A byte is two hex digits long and is between 00 and FF (0 to 255 in dec).  A
word is 4 hex digits long and is between 00 00 and FF FF (0 to 65535 in dec).  Finally a
long is made up of 8 hex digits and is between 00 00 00 00 and FF FF FF FF (0 to
4294967295 in dec).

In other words:

byte:      $00              - $FF                   #0 - #255
word:      $00 00           - $FF FF                #0 - #65535
long:      $00 00 00 00     - $FF FF FF FF          #0 - #4294967295
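
In C the same three sizes can be pinned down with the fixed-width types from the
standard header stdint.h.  This is only a sketch: the type names hex_byte, hex_word and
hex_long and the helper max_for_bytes are made up for this example.

```c
#include <stdint.h>

typedef uint8_t  hex_byte;   /* 1 byte,  2 hex digits */
typedef uint16_t hex_word;   /* 2 bytes, 4 hex digits */
typedef uint32_t hex_long;   /* 4 bytes, 8 hex digits */

/* Largest unsigned value that fits in n_bytes bytes (meant for 1, 2 or 4). */
uint32_t max_for_bytes(unsigned int n_bytes)
{
    if (n_bytes >= 4) {
        return UINT32_MAX;                       /* $FF FF FF FF */
    }
    return (UINT32_C(1) << (8 * n_bytes)) - 1;   /* 8 bits per byte, all set to one */
}
```

The helper should give back 255 for one byte, 65535 for two and 4294967295 for four,
matching the table above.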

As you can see a byte takes up one fourth of the memory a long does.  This principle will
be discussed further in the chapter dealing with HexEditors.  It is a good idea to learn
how many digits a byte, a word and a long have, since I will use these expressions later
on.  I chose to use only these three (and to leave out floats and doubles) because I feel
that even an experienced person can get far with just these three length notations.

For those interested, the programming language C uses the following expressions to refer
to the length of numbers:

char                c = 'A';  // 1-byte long by definition (in C and C++).
short int           si= 1;    // minimum range +/-32767.
short               s = 2;    // short same as short int.
int                 i = 3;    // minimum range +/-32767.
long int            li= 4;    // minimum range +/-2147483647.
long                l = 5;    // long same as long int.
float               f = 10.1; // min 6 digits (decimal) precision.
double              d = 11.2; // min 10 digits (decimal) precision.
long double         ld= 12.3;

unsigned char       uc;       // unsigned integers can only store
unsigned short int  usi;      // positive numbers.
unsigned int        ui;
unsigned long int   uli;

signed char         sc;       // signed integers can store positive
signed short int    ssi;      // or negative numbers.
signed int          si2;
signed long int     sli;
(Information taken from "C Reference Card" by Argus Software Engineering)


--==< ASCII Characters >==--

With the arrival of networks reaching from one country to another arose the problem of
character mapping.  When you push the letter "a" on your keyboard, the hardware components
of the computer send a number value to the processor which represents the letter "a".  But
how on earth would a computer in Yugoslavia, configured to deal with the Yugoslavian
alphabet, be able to interpret the letter "ä", which is fairly common in the Swedish
language?  To reduce the problem a standard character code, the American Standard Code
for Information Interchange (ASCII), was adopted in most places.  What this means is that
(in theory at least) the basic English letters, digits and punctuation marks have the
same number values no matter where you are in the world.  Unfortunately this only goes so
far: characters outside that basic set (such as "ä") are not part of ASCII and are mapped
differently on different systems, and different keyboards have different layouts etc...
The good news is that just like you didn't have to know how to convert hex numbers in
your head, it's enough that you know that ASCII refers to the numerical values of the
different characters that the computer can interpret as such.
 
Now you know that when you push a key on the keyboard, the corresponding number value is
sent to the processor (well in reality it's interpreted by the OS and sent to the active
application).  So, what is this number value?  Well, every character on the keyboard is
represented by a different number.  For example the English lowercase alphabetical
characters range from $61 to $7A (a-z).  Notice that when it comes to computers there's a
definite difference between lowercase and uppercase letters.  Thus the uppercase English
letters are represented by the numbers $41 to $5A (A-Z).

It is important to realize that every ASCII character can be represented by a number
that fits in a byte, meaning a number between 0 and 255 ($00-$FF).  Standard ASCII itself
only defines the values 0-127 ($00-$7F); the values 128-255 are used for extra characters
that differ from one system to another.  Thus a one-byte character map can handle at most
256 characters.

But that's of no real importance either.  The most common ASCII characters and their
values in both hex and decimal form are available in the included file "ASCII.txt".

Now then, we know that ASCII characters are represented by numbers.  For example the
capital letter "A" is represented by 65 ($41).  "B" is 66 ($42) and "C" is 67 ($43).  So
the letters "ABC" could be represented by the ASCII values 65 66 67 (or in hex 41 42 43).
And this brings us to our next topic, strings.
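
Since characters are just numbers, you can do arithmetic on them.  Here is a minimal
sketch in C, assuming an ASCII system: the lowercase letters ($61-$7A) sit exactly $20
above the uppercase ones ($41-$5A), so subtracting $20 turns "a" into "A".  The function
name ascii_upper is made up for this example.

```c
/* Convert a lowercase ASCII letter ($61-$7A) to uppercase ($41-$5A).
   Any other character is returned unchanged.  Assumes an ASCII system. */
char ascii_upper(char c)
{
    if (c >= 0x61 && c <= 0x7A) {
        return (char)(c - 0x20);
    }
    return c;
}
```

On such a system the C expression 'A' is simply the number 65, and ascii_upper('a')
gives back that same 65.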


--==< Strings >==--

The expression "string" refers to a sequence of characters.  For example "Hello
world!" would be a string.  Notice that the computer doesn't care about the space between
the two words; the space is just another character, and the whole sentence is one string
of characters.  This leads to the problem of representing strings.  Imagine what a string
looks like from the computer's point of view.  It is a sequence of numbers stored
somewhere in memory.  And unless you tell the computer how to recognize the end of the
string, it will not know where the string ends.

There are currently two standard ways of representing strings: the C way and the Pascal
way.  I'll start with the C way since it's easier.  Basically, after the last character in
the string there is a zero-byte.  This means that a value of zero marks the end of the
string.  For example:
 
  H     E     L     L     O     _     W     O     R     L     D     !    •    
 72    69    76    76    79    95    87    79    82    76    68    33   00
$48   $45   $4c   $4c   $4f   $5f   $57   $4f   $52   $4c   $44   $21  $00

Keeping in mind that the size of an ASCII character is that of a byte (max 255), we notice
that using the C method the length of the string is actually increased by one byte: the
zero-byte at the end.  When a program needs to use the above string, it needs to
know the memory address of the first character, and it knows that it has hit the end of
the string when the value of the character is zero.
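
The "walk until you hit the zero-byte" idea looks roughly like this in C.  The standard
library already has a function for it called strlen; the hand-written version below (with
the made-up name c_string_length) is only there to show what goes on.

```c
#include <stddef.h>

/* Count the characters of a C string by walking from the first
   character until the zero-byte is found. */
size_t c_string_length(const char *s)
{
    size_t n = 0;
    while (s[n] != 0) {     /* a value of zero marks the end of the string */
        n++;
    }
    return n;
}
```

For "HELLO_WORLD!" it counts 12 characters, while the string itself occupies 13 bytes in
memory because of the zero-byte at the end.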

The Pascal method is a bit different.  It stores the number of characters in the string
as the first byte.  The example above would be portrayed like this in Pascal notation:

     •     H     E     L     L     O     _     W     O     R     L     D     !               
    12    72    69    76    76    79    95    87    79    82    76    68    33 
   $0c   $48   $45   $4c   $4c   $4f   $5f   $57   $4f   $52   $4c   $44   $21  

As you might have noticed there are 12 characters in the string (including the "_" and
the "!" signs).  So using the Pascal method, the program reads the first byte of the
string and thus determines its length.  Since the length has to fit in one byte, a string
stored this way can be at most 255 characters long.
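
Here is a sketch in C of how a string could be packed the Pascal way, with the length in
the first byte.  The function name make_pascal_string is made up for this example, and
refusing strings longer than 255 characters is my own choice for the sketch (one byte
cannot hold a larger length).

```c
#include <stddef.h>
#include <string.h>

/* Store text in out as a Pascal-style string: one length byte followed by
   the characters, with no zero-byte at the end.
   Returns the number of bytes written, or 0 if the text is longer than
   255 characters or does not fit in out. */
size_t make_pascal_string(const char *text, unsigned char *out, size_t out_size)
{
    size_t len = strlen(text);
    if (len > 255 || out_size < len + 1) {
        return 0;
    }
    out[0] = (unsigned char)len;    /* the length goes first */
    memcpy(out + 1, text, len);     /* then the characters themselves */
    return len + 1;
}
```

Packing "HELLO_WORLD!" this way gives 13 bytes, starting with $0C (twelve) and followed
by $48 for the "H".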

This whole concept will be developed further in the next chapter.


--==< Hex Editors >==--

NOTICE: When dealing with hex editors you are going to be changing real files on your
computer.  By changing just one byte in a file you can corrupt it to the extent that it
will not be usable any more!  So always make sure that you are working on a BACKUP of the
file. The easiest thing to do is to create a folder where you copy all the files that you
want to change with the HexEditor.

Remember how all data processed by the computer is made up of ones and zeros?  Well, the
same principle holds true for files stored on the hard disk, on a floppy disk, on a CD-ROM,
or on any other storage media.  But because hexadecimal numbers are easier to deal with
than binary numbers, we have programs that can read the content of any storage media as
pure hexadecimal data.  These programs are called HexEditors.  Using the above idea, any
file stored on a media can be opened and its contents will be represented as hexadecimal
numbers.  And it does not matter whether the file is an application program or just a
simple text file, since ALL files are at their "lowest level" made up of binary numbers
and can thus be viewed by a HexEditor.

The first thing you have to do is to find yourself a HexEditing program.  It doesn't matter
which computer platform you have.  HexEditors exist for PCs, Macs, Unix machines, even
C-64's.  Once you've found a HexEditor, open up any backup file with the program.  I have
a Mac and I use HexEdit 1.0.7, a freeware program by Jim Bumgardner.  If I open an
application file I get something like this:

(See the picture "HexEdit.jpg")

Please note that you WILL get something completely different, since the chances of us
opening the same file are very slim, and different HexEditors present the information in
different ways.

Let me explain the above picture.  To the left you have the Offset column.  "Offset"
refers to the distance of a piece of data from the first byte in the file.  Since the
offset here starts at zero we know that we are looking at the beginning of the file.  Also
notice that the offsets are displayed as hex values.  A good HexEditor should be able to
display the offset as decimal numbers as well.

In the middle you have the Hex column.  This is where all the hexadecimal data can be
found. If you converted all these numbers to binary, you'd have a representation of the
binary information of the file as you would find it on the Hard Drive.

Finally on the right side is the ASCII column.  This is an ASCII representation of the Hex
values.  This means that each byte is looked up in an ASCII table and the character it
stands for is displayed in this column.
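
To give a feel for the three columns, here is a rough sketch in C that formats one line
the way a HexEditor might: offset, hex values, then ASCII.  This is not how HexEdit does
it; the name format_dump_line, the exact layout, and the choice to show a "." for bytes
that are not printable ASCII are all assumptions of mine.

```c
#include <stdio.h>
#include <stddef.h>

#define DUMP_LINE_SIZE 96   /* big enough for one formatted line of 16 bytes */

/* Format up to 16 bytes as "offset: hex bytes ASCII characters". */
void format_dump_line(unsigned long offset, const unsigned char *data,
                      size_t count, char out[DUMP_LINE_SIZE])
{
    if (count > 16) {
        count = 16;
    }
    /* Offset column, shown in hex. */
    int pos = snprintf(out, DUMP_LINE_SIZE, "%06lX: ", offset);

    /* Hex column: two digits per byte. */
    for (size_t i = 0; i < count; i++) {
        pos += snprintf(out + pos, (size_t)(DUMP_LINE_SIZE - pos),
                        "%02X ", (unsigned int)data[i]);
    }
    /* ASCII column: printable characters as they are, the rest as '.'. */
    for (size_t i = 0; i < count; i++) {
        unsigned char c = data[i];
        out[pos++] = (c >= 0x20 && c <= 0x7E) ? (char)c : '.';
    }
    out[pos] = '\0';
}
```

Given the four bytes 48 45 4C 00 at offset zero, the line comes out as
"000000: 48 45 4C 00 HEL.".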

OK, now what?  Well, as an example I'll describe the use of HexEditors as a way to cheat
on computer games.  

Of course you cannot use a HexEditor to cheat on a game while you are playing it.  Those
situations will be dealt with in the next chapter.  What you can do with a HexEditor,
however, is to change saved games.  I mean, think about it.  What is the program actually
doing when it is saving a game?  It saves all the data about the game to a file.  Like
where you are positioned on the map, what items you carry, how many monsters are gonna
attack you etc...  In this example I will use Realmz, a shareware game for the MacOS.

The first thing you have to do is to find where on the hard drive the game saves its
files.  Some games allow you to save wherever you want, while others will only allow you
to save into a certain set of game folders (usually 1-10 or something like that).  So
search through the game's folders (directories as they are also called), and look for a
file that has the same name as your saved game.

The next step is to find the document in which the game stores the information you want
to change.  For example Realmz is a Dungeons & Dragons game for the Mac where you can
create your own characters.  The attributes of a character, such as its strength or
stamina, are saved in a file that has the same name as the character.

Let us presume that I have a character called Pro.  His attributes are stored in the file
called "Pro".  I want to change my character's strength.  I want to make him stronger so
that he can cause more damage with each hit.  The first thing I would do is to run the
game and see how strong he is at that particular time.  This will be the value that the
game stores in the "Pro" file.  He has a strength  of 105.  So I convert this number to
hex, which gives me $69. And then I set out to look for the hex byte $69 in the saved file.
To make things easier I look for the hex word "00 69" since the possibility of the string
"00 69" appearing several times in the file is smaller than that of the string "69". (Read
"Note on HexEditors and numbers" for more information regarding this.)  When I've found
this value I change it to whatever I want it to be and then I save my work.  

The problem might arise that "00 69" appears in more than one place in the file.  The
easiest (and most dangerous) way is to change all of them to the value you want.  By
doing this, however, you might change values which are very important to the program
and cause it to freeze.  By using a trial and error method you can try changing a
different occurrence every time and see if the one you changed was the correct one.  The
most effective method, however, is to look at "00 69" in context.  Meaning, look at the
other numbers around it.  For instance, if you recognize the number after "00 69" as the
movement points of the character then there's a good chance that you're on the right track.
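
The search itself is simple enough to sketch in C.  The function below (find_pattern is a
made-up name, and this is not how any particular HexEditor does it) walks through a block
of bytes and notes every offset where a given sequence of bytes appears.

```c
#include <stddef.h>
#include <string.h>

/* Look for every occurrence of pattern in data.
   The first max_hits offsets are stored in hits.
   Returns the total number of occurrences found. */
size_t find_pattern(const unsigned char *data, size_t data_len,
                    const unsigned char *pattern, size_t pattern_len,
                    size_t *hits, size_t max_hits)
{
    size_t found = 0;
    if (pattern_len == 0 || pattern_len > data_len) {
        return 0;
    }
    for (size_t i = 0; i + pattern_len <= data_len; i++) {
        if (memcmp(data + i, pattern, pattern_len) == 0) {
            if (found < max_hits) {
                hits[found] = i;    /* remember the offset of this match */
            }
            found++;
        }
    }
    return found;
}
```

In the seven made-up bytes 00 69 12 69 00 69 05 the single byte "69" turns up three
times, but the word "00 69" only twice (at offsets 0 and 4), which is why searching for
the longer pattern narrows things down.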

Note for Macintosh users:  The MacOS divides up a file into two parts, the data fork and
the resource fork.  Without getting too much into programming, here's what the purpose of
these two forks is.  The resource fork should contain information such as what a window
looks like, where it is located, what the menus look like etc.  In other words,
information used by the Operating System.  The data fork should be used to store the
information used by the user.  For example, in a word processor file the resource fork
might contain information regarding the size of the window, while the data fork might
contain the actual text written by the user.  However, the programmer is not obliged to
follow these criteria.  They are only suggestions made by Apple.  So, when you are
looking at a file with a HexEditor on a Mac, be sure to check both forks of the file for
the information you are looking for.


--==< Note On HexEditors And Numbers >==--

I find it appropriate to give a bit of a review of numbers and strings.
To use the example from above, let's presume that my character had the strength of $69.
What we don't know is how the program stores this number.  It might store it as a byte, a
word, or a long (see the chapter about bytes, words and longs for more info about this).
Using common sense, if my character has a strength of $69 and is considered very very
strong then the program will probably save the value as a byte or a word.  It's completely
pointless for it to store it as a long (although it might happen).  If, however, we look
at the character's experience points and the value is larger than the range of a word,
then it HAS to be stored in a long (or something larger).  So if the experience points
are, say, $ABCDE, then instead of searching for "ABCDE" you can search for "00 0A BC DE",
which should narrow down the number of occurrences of that number.

Another thing that needs to be discussed is that a stored number always has an even
number of hex digits.  A programmer deals with blocks (units) of memory.  The program
reserves these blocks once it's launched.  The smallest block a programmer deals with is
a byte.  This means that no matter how much the programmer wants it, he/she can never
store the number "1" just like that.  If it is to be stored in memory it will be stored
as "01".  However, if the programmer assigned the number to be a word it will be stored
as "00 01".  And if it was assigned to be a long it will be stored as "00 00 00 01".  The
computer doesn't care what number is stored in the variable.  It only cares about the
length of the variable.  Thus if the computer stores three longs with the values $1, $22
and $333 respectively then it will look like this once you open the file with a HexEditor:

00 00 00 01 00 00 00 22 00 00 03 33
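
Here is a sketch in C that produces exactly those twelve bytes.  It assumes the byte
order of the 68k processors discussed in this file, where the most significant byte is
stored first (called big-endian); on processors such as Intel's x86 the four bytes of
each long are stored in the reverse order.  The function name store_long_be is made up
for this example.

```c
#include <stdint.h>

/* Write value into out as four bytes, most significant byte first
   (big-endian, the order assumed for the 68k in this file). */
void store_long_be(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}
```

Storing $1, $22 and $333 one after the other in a 12-byte buffer gives
00 00 00 01 00 00 00 22 00 00 03 33, no matter how small the numbers themselves are.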

Let's say you want to change the $333 part to $433.  A good HexEditor might allow you to
search for "333" but remember that the smallest unit is a byte.  When you are changing
"00 00 03 33" to "00 00 04 33" it's pointless to change all 8 digits.  It's enough if you
change the third byte ("03" to "04").  Notice, however, that you can't just change 3 to 4.
You have to change "03" to "04".  A good HexEditor should actually not allow you to change
one digit at a time.  It should require you to change one byte, 2 digits, at a time.  If
you are confused then re-read this chapter, and the previous chapter dealing with lengths
of numbers.  This is important stuff, and it's very important that you know it well!